71 research outputs found

    Shape Outlier Detection and Visualization for Functional Data: the Outliergram

    We propose a new method to visualize and detect shape outliers in samples of curves. In functional data analysis we observe curves defined over a given real interval, and shape outliers are those curves that exhibit a different shape from the rest of the sample. Whereas magnitude outliers, that is, curves that exhibit atypically high or low values at some points or across the whole interval, are in general easy to identify, shape outliers are often masked among the rest of the curves and thus difficult to detect. In this article we exploit the relation between two depths for functional data to help visualize curves in terms of shape and to develop an algorithm for shape outlier detection. We illustrate the use of the visualization tool, the outliergram, through several examples and assess the performance of the algorithm in a simulation study. We apply both to the detection of outliers in a children's growth dataset in which the girls' sample is contaminated with boys' curves and vice versa. Comment: 27 pages, 5 figures
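The abstract does not name the two depths. The sketch below assumes they are the modified band depth (MBD) and the modified epigraph index (MEI), and that the relation between them is the parabolic upper bound on MBD as a function of MEI. Curves that fall far below the parabola are candidate shape outliers. The function names, the toy sample, and the use of a plain `argmax` in place of a formal detection rule are all illustrative assumptions, and the code assumes curves observed on a common grid with no ties.

```python
import numpy as np

def mbd_mei(curves):
    """Return (MBD, MEI) for an (n_curves, n_points) array with no ties."""
    n = curves.shape[0]
    # rank[i, t]: position (1..n) of curve i among all curves at grid point t
    rank = curves.argsort(axis=0).argsort(axis=0) + 1
    # Number of pairs of curves whose band contains curve i at grid point t
    in_band = (rank - 1) * (n - rank) + (n - 1)
    mbd = in_band.mean(axis=1) / (n * (n - 1) / 2)
    # Proportion of curves lying on or above curve i, averaged over the grid
    mei = ((n - rank + 1) / n).mean(axis=1)
    return mbd, mei

def parabola_bound(mei, n):
    """Upper bound on MBD given MEI; attained when curves never cross."""
    a0 = a2 = -2.0 / (n * (n - 1))
    a1 = 2.0 * (n + 1) / (n - 1)
    return a0 + a1 * mei + a2 * n**2 * mei**2

# Toy sample: four parallel increasing lines plus one decreasing curve
t = np.linspace(0.0, 1.0, 50)
sample = np.vstack([t + c for c in (0.0, 1.0, 2.0, 3.0)] + [3.05 - 3.1 * t])

mbd, mei = mbd_mei(sample)
# Vertical distance to the parabola; larger means more atypical shape
d = parabola_bound(mei, sample.shape[0]) - mbd
print("most atypical shape: curve", int(np.argmax(d)))
```

In this toy sample the decreasing curve crosses all the others, so it sits furthest below the parabola. A practical detector would still need a threshold on `d`, which this sketch does not attempt.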

    Dependency evolution in Spanish disabled population : a functional data analysis approach

    In a health context, dependency is defined as a lack of autonomy in performing basic activities of daily living, requiring the care of another person or significant help. However, this contingency, if present, changes throughout the lifetime. In fact, empirical evidence shows that, once this situation occurs, it is almost impossible to return to the previous state, and in most cases the intensity increases. In this article, the evolution of the intensity of this situation is studied for the Spanish population affected by this contingency. Evolution in dependency can be seen as sparsely observed functional data, where for each individual we get a curve observed only at those points at which changes in the condition of his/her dependency occur. We use functional data analysis techniques such as curve registration, functional data depth and distance-based clustering to analyse this kind of data. This approach proves to be useful in this context since it takes into account the dynamics of the dependency process and provides more meaningful conclusions than simple pointwise or multivariate analysis. The database analysed comes from the Survey about Disabilities, Personal Autonomy and Dependency Situations, EDAD 2008 (Spanish National Institute of Statistics, 2008). In Spain, the evaluation of each person's dependency situation is governed by Royal Decree 504/2007, which approves the assessment scale established by Act 39/2006. In this article, the scale value for each individual included in EDAD 2008 has been calculated according to this legislation. Differences by sex, age and time of first appearance have been considered, and a prediction of the future evolution of dependency is obtained. The research of Ana Arribas-Gil was supported by grants ECO2011-25706 and MTM2010-17323 of the Spanish Ministry of Science and Innovation

    Constructing a Children's Subjective Well-Being Index: an Application to Socially Vulnerable Spanish Children

    It is well known that traditional economic measures such as household income appear to play less of a role in explaining children's subjective well-being than adults'. This paper focuses on the construction of a children's well-being index that takes into account subjective and emotional factors, such as children's experiences of material deprivation and bullying, the quality of their relationships with family and peers, the quality of services in their neighbourhood, and personal well-being. The index is constructed from principal component analysis and rescaled to 0-100% for easier interpretation. The data come from a survey run in Spain in 2016 by the largest humanitarian organization involved in social programs in the country, covering socially vulnerable children aged 8-11, with around 2,900 respondents. The main findings are: (i) bullying makes the difference between children being moderately or completely dissatisfied with their lives; (ii) no single Spanish region reaches satisfactory well-being levels across all the components of the index. The methodology proposed for the construction of the index is general enough to be applied to the general child population, regardless of social vulnerability or even country, provided the questionnaire is adapted appropriately. Financial support from research project MTM2014-56535-R by the Spanish Ministry of Economy and Competitiveness
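As a rough illustration of an index built from principal component analysis and rescaled to 0-100, the sketch below projects standardized indicators onto the first principal component and applies min-max rescaling. The abstract does not say how many components are used or how the rescaling is done, so both choices, along with the simulated data and the name `pca_index`, are assumptions made only for this example.

```python
import numpy as np

def pca_index(responses):
    """Min-max rescaled first principal component score on a 0-100 scale."""
    # Standardize each indicator (column) to zero mean and unit variance
    z = (responses - responses.mean(axis=0)) / responses.std(axis=0)
    # The first right singular vector of z is the first principal axis
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    axis = vt[0]
    # The sign of a principal axis is arbitrary; orient it so that higher
    # indicator values push the index up
    if axis.sum() < 0:
        axis = -axis
    score = z @ axis
    return 100.0 * (score - score.min()) / (score.max() - score.min())

# Simulated data: four noisy indicators driven by one latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
responses = np.column_stack(
    [latent + 0.5 * rng.normal(size=200) for _ in range(4)]
)

index = pca_index(responses)
print(index.min(), index.max())
```

Min-max rescaling ties the endpoints to the lowest and highest scoring respondents in the sample, so values are comparable within a survey but not automatically across surveys.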

    Dependence evolution in the Spanish disabled population: a functional data analysis approach

    In a health context, dependence is defined as a lack of autonomy in performing the basic activities of daily living and requiring care giving or significant help from another person. However, this contingency, if present, changes over one's lifetime. Empirical evidence shows that, once this situation occurs, it is almost impossible to return to the previous state and in most cases increases in intensity. In the paper, the evolution of the intensity of this situation is studied for the Spanish population affected by this contingency. Evolution in dependence can be seen as sparsely observed functional data, where we obtain a curve for each individual that is observed at only those points where changes in his or her condition of dependence occur. We use functional data analysis techniques, such as curve registration, functional data depth and distance-based clustering, to analyse this type of data. This approach proves to be useful in this context because it considers the dynamics of the dependence process and provides more meaningful conclusions than simple pointwise or multivariate analysis. We use the sample statistics obtained to predict the future evolution of dependence. The database analysed originates from the ‘Survey on disability, personal autonomy and dependence situations’ in Spain in 2008. The survey is the largest and most complete survey to be made available in Europe for the study of disability. In addition, the Spanish legislation is one of the most recent in Europe and provides a detailed quantitative scale to assess dependence. In the paper, the scale value according to this legislation has been calculated for each individual included in the survey. Differences between sex, age and the time of first appearance were considered, and a prediction of the future evolution of dependence is obtained

    Classification of longitudinal data through a semiparametric mixed-effects model based on lasso-type estimators

    We propose a classification method for longitudinal data. The Bayes classifier is classically used to determine a classification rule where the underlying density in each class needs to be well modeled and estimated. This work is motivated by a real dataset of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. The proposed model, which is a semiparametric linear mixed-effects model (SLMM), is a particular case of the semiparametric nonlinear mixed-effects class of models (SNMM) in which finite dimensional (fixed effects and variance components) and infinite dimensional (an unknown function) parameters have to be estimated. In SNMMs, maximum likelihood estimation is performed iteratively, alternating parametric and nonparametric procedures. However, if one can assume that the random effects and the unknown function interact in a linear way, more efficient estimation methods can be used. Our contribution is the proposal of a unified estimation procedure based on a penalized EM-type algorithm. The Expectation and Maximization steps are explicit. In the latter step, the unknown function is estimated in a nonparametric fashion using a lasso-type procedure. A simulation study and an application to real data are performed. The authors are grateful to two anonymous referees and an Associate Editor for their insightful comments and valuable suggestions, which led to substantial improvements in the presentation of this work. Ana Arribas-Gil was supported by projects MTM2010-17323 and ECO2011-25706, Spain. Rolando de la Cruz was supported by project FONDECYT 1120739, grant ANILLO ACT–87, and grant FONDAP 15130011, Chile. Cristian Meza was supported by projects FONDECYT 11090024 and 1141256, and grant ANILLO ACT–1112, CONICYT-PIA, Chile

    Patient No-Show Prediction: A Systematic Literature Review

    Among the most important problems faced by health centers today are those caused by patients who do not attend their appointments. Among other effects, these no-shows cause loss of revenue to the health centers and lengthen patient waiting lists. To tackle these problems, several scheduling systems have been developed. Many of them require predicting whether a patient will show up for an appointment. However, obtaining these estimates accurately is currently a challenging problem. In this work, a systematic review of the literature on predicting patient no-shows is conducted, aiming to establish the current state of the art. Based on a systematic review following the PRISMA methodology, 50 articles were found and analyzed. Of these articles, 82% were published in the last 10 years, and the most used technique was logistic regression. In addition, there is significant growth in the size of the databases used to build the classifiers. An important finding is that only two studies achieved an accuracy higher than the show rate. Moreover, a single study attained an area under the curve greater than 0.9. These facts indicate the difficulty of this problem and the need for further research
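The comparison of accuracy with the show rate can be read as a comparison with a trivial baseline: a rule that predicts "will attend" for every patient is correct exactly as often as patients attend. The sketch below illustrates this with made-up attendance data; the numbers and variable names are hypothetical and are not taken from any reviewed study.

```python
def accuracy(predicted, actual):
    """Fraction of appointments whose outcome was predicted correctly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical outcomes: 1 = patient attended, 0 = no-show
attended = [1] * 80 + [0] * 20
show_rate = sum(attended) / len(attended)

# Trivial baseline: predict that every patient attends
always_show = [1] * len(attended)
baseline_accuracy = accuracy(always_show, attended)

print(show_rate, baseline_accuracy)
```

Because no-shows are the minority class, a classifier has to beat this baseline before its accuracy says anything about its ability to identify no-shows, which is why measures such as the area under the curve are reported alongside it.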

    Depthgram: Visualizing outliers in high-dimensional functional data with application to fMRI data exploration.

    Functional magnetic resonance imaging (fMRI) is a non-invasive technique that facilitates the study of brain activity by measuring changes in blood flow. Brain activity signals can be recorded during the alternate performance of given tasks, that is, task fMRI (tfMRI), or during resting-state, that is, resting-state fMRI (rsfMRI), as a measure of baseline brain activity. This contributes to the understanding of how the human brain is organized in functionally distinct subdivisions. fMRI experiments from high-resolution scans provide hundreds of thousands of longitudinal signals for each individual, corresponding to brain activity measurements over each voxel of the brain throughout the duration of the experiment. In this context, we propose novel visualization techniques for high-dimensional functional data relying on depth-based notions that enable computationally efficient two-dimensional representations of fMRI data, which elucidate sample composition, outlier presence, and individual variability. We believe that this preliminary step is crucial to any inferential approach aiming to identify neuroscientific patterns across individuals, tasks, and brain regions. We present the proposed technique via an extensive simulation study, and demonstrate its application on a motor and language tfMRI experiment. Agencia Estatal de Investigación, Spain, Grant/Award Number: PID2019-109196GB-I00; Ministerio de Economía y Competitividad, Spain, Grant/Award Numbers: ECO2015-66593-P, MTM2014-56535-R.